The file contains 3035 rows, one for each county included in the analyses, and 42 data columns (plus the county identifier, the county FIPS code) holding aggregated and de-identified county-level indicators of Facebook Groups usage.

The first column in the file is the 5-digit county FIPS code (https://www.nrcs.usda.gov/wps/portal/nrcs/detail/national/home/?cid=nrcs143_013697).
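County FIPS codes are easy to damage on load, because a tool that parses them as integers drops the leading zero of codes such as 01001. The sketch below shows one way to restore the 5-digit form. It assumes a comma-separated file with a header row; the column name county_fips and the two sample rows are placeholders for illustration, not values taken from the file.

```python
import csv
import io

# Hypothetical two-row excerpt: the column names and values are placeholders.
sample = io.StringIO(
    "county_fips,group_avg_tie_density\n"
    "1001,0.10\n"
    "36061,0.20\n"
)

def normalize_fips(raw):
    """Return a 5-character FIPS string, restoring dropped leading zeros."""
    code = str(raw).strip()
    if not code.isdigit() or len(code) > 5:
        raise ValueError(f"not a county FIPS code: {raw!r}")
    return code.zfill(5)

rows = list(csv.DictReader(sample))
fips = [normalize_fips(row["county_fips"]) for row in rows]

# The first two digits of a county FIPS code identify the state.
state_codes = [code[:2] for code in fips]
```

Keeping the code as a string from the start avoids the problem entirely; the helper only matters when the column has already been read as a number.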

The remaining 42 columns can be grouped under two categories:

Aggregate local group characteristics (6):

group_avg_gender_blau: The average probability that two randomly chosen members of a group have the same gender, computed over the local groups observed in a county. In the paper, we use the complement of this score, which represents gender diversity (1 - group_avg_gender_blau).

group_avg_locale_blau: The average probability that two randomly chosen members of a group have the same locale, computed over the local groups observed in a county. In the paper, we use the complement of this score, which represents locale diversity (1 - group_avg_locale_blau).
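As a rough illustration of the two Blau-style columns, the sketch below computes the same-category probability for each group as the sum of squared category shares, then averages it over a county's groups. Two details are assumptions, since this file does not state them: that members are drawn with replacement, and that the county average is unweighted. The function names and the example groups are hypothetical.

```python
from collections import Counter

def same_category_probability(labels):
    """Probability that two members drawn at random (with replacement)
    share a category: the sum of squared category shares."""
    n = len(labels)
    if n == 0:
        raise ValueError("group has no members")
    counts = Counter(labels)
    return sum((c / n) ** 2 for c in counts.values())

def county_average(groups):
    """Unweighted mean of the per-group score over a county's local groups."""
    scores = [same_category_probability(g) for g in groups]
    return sum(scores) / len(scores)

# Hypothetical groups, for illustration only.
groups = [["f", "f", "m", "m"], ["f", "f", "f", "f"]]
avg = county_average(groups)   # mean of 0.5 and 1.0
diversity = 1 - avg            # the complement used in the paper
```

The same function applies to locale by passing locale labels instead of gender labels.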

group_avg_tie_density: Average, over the local groups observed in a county, of the ratio of realized Facebook friendship ties between group members to all possible pairs of members.
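For a single group, the tie density described above can be sketched as the number of distinct friendship ties divided by n(n-1)/2 possible pairs. The function name and the input format (a member count plus a list of member pairs) are hypothetical, and treating ties as undirected is an assumption.

```python
def tie_density(n_members, friendships):
    """Realized friendship ties divided by the number of possible member pairs."""
    if n_members < 2:
        raise ValueError("density is undefined for fewer than two members")
    # Treat ties as undirected: drop self-ties and count each pair once.
    ties = {frozenset(pair) for pair in friendships if pair[0] != pair[1]}
    possible = n_members * (n_members - 1) // 2
    return len(ties) / possible

# Hypothetical 4-member group: ("a", "b") and ("b", "a") are the same tie.
density = tie_density(4, [("a", "b"), ("b", "a"), ("b", "c"), ("c", "d")])
```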

group_avg_iqr_age: Average inter-quartile range of the age distribution of local groups. The inter-quartile range is defined as the number of years between the 25th and 75th percentile values of the empirical age distribution.
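A minimal sketch of the inter-quartile range for one group's ages, followed by the county-level average. The interpolation rule for percentiles is not specified in this file; the "inclusive" method of Python's statistics.quantiles is used here as an assumption, and other rules can give slightly different values. The ages are made up for illustration.

```python
from statistics import mean, quantiles

def age_iqr(ages):
    """Years between the 25th and 75th percentiles of a group's ages."""
    q1, _median, q3 = quantiles(ages, n=4, method="inclusive")
    return q3 - q1

# Hypothetical age lists for two local groups in one county.
group_ages = [[20, 25, 30, 35, 40], [30, 30, 30, 30, 30]]
iqrs = [age_iqr(ages) for ages in group_ages]
county_avg_iqr = mean(iqrs)
```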

control_p_member_gated: Ratio of local groups in a county that have at least one of three controls on joining as a new member: admin approval, agreement to group rules, or questions that must be answered before joining.

control_p_content_gated: Ratio of local groups in a county where admin approval is required for posts before they become visible to other members.

Participation in different group types (36):

In this category we measure what percentage of active Facebook users in a county participate in different group types by size, privacy, and locality.

First, we assign each group to one of 36 mutually exclusive buckets (3 x 4 x 3), organized along three dimensions: privacy (public / private / hidden), number of users in the group (very small, small, mid-sized, large), and locality (very local, local, non-local). Then we compute the percentage of users living in a county who participate actively as a member in at least one group in each bucket. The thresholds and bucket definitions are provided in the paper.
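The "at least one such group" rule means a user counts once per bucket, however many groups in that bucket they belong to. The sketch below illustrates this with sets. The function name, the data layout, and the user IDs are hypothetical, and whether the file stores values on a 0-100 or a 0-1 scale is not stated here, so the 0-100 scale is an assumption.

```python
def participation_pct(county_users, active_members_by_group):
    """Percentage of a county's users who are active in at least one group
    of a given bucket; each user is counted once."""
    if not county_users:
        raise ValueError("county has no users")
    # Union of active members across all groups in the bucket.
    active = set().union(*active_members_by_group.values())
    return 100 * len(county_users & active) / len(county_users)

# Hypothetical county with four users and two groups in one bucket.
county_users = {"u1", "u2", "u3", "u4"}
bucket_groups = {
    "group_a": {"u1", "u2"},
    "group_b": {"u2", "u9"},   # u9 lives elsewhere; u2 is in both groups
}
pct = participation_pct(county_users, bucket_groups)
```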

The columns are named following the <privacy>_<locality>_<size> format, where

privacy: [open, closed, secret] corresponding to public, private, hidden respectively,
locality: [very_local, local, non_local]
size: [large, midsized, small, very_small]

For example, the column secret_very_local_very_small for a county represents the percentage of Facebook users who live in the county and who are active members of at least one very small, hidden group with a very high locality score.
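The naming scheme can be enumerated directly, which is handy for selecting or validating the 36 participation columns. This sketch assumes the column names in the file match the format above exactly. Because the locality and size values themselves contain underscores, the hypothetical parse_column helper matches against the known combinations rather than splitting on "_".

```python
from itertools import product

PRIVACY = ["open", "closed", "secret"]
LOCALITY = ["very_local", "local", "non_local"]
SIZE = ["large", "midsized", "small", "very_small"]

# Mapping from the column prefix to the privacy label used in the text.
PRIVACY_LABEL = {"open": "public", "closed": "private", "secret": "hidden"}

columns = [f"{p}_{l}_{s}" for p, l, s in product(PRIVACY, LOCALITY, SIZE)]

def parse_column(name):
    """Split a participation column name into its three dimensions."""
    for p, l, s in product(PRIVACY, LOCALITY, SIZE):
        if name == f"{p}_{l}_{s}":
            return {"privacy": PRIVACY_LABEL[p], "locality": l, "size": s}
    raise ValueError(f"not a participation column: {name!r}")

info = parse_column("secret_very_local_very_small")
```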